Closed Bug 1956453 Opened 2 months ago Closed 2 months ago

Crash in [@ mozilla::nsDisplayItem::GetPerFrameKey]

Categories

(Core :: Layout, defect)

defect

Tracking

()

VERIFIED FIXED
138 Branch
Tracking Status
firefox-esr115 --- unaffected
firefox-esr128 --- unaffected
firefox136 --- unaffected
firefox137 --- unaffected
firefox138 + verified

People

(Reporter: aryx, Assigned: emilio)

References

(Blocks 2 open bugs, Regression, )

Details

(5 keywords, Whiteboard: [viewtransitions:m1], [bugmon:bisected,confirmed], [wptsync upstream])

Crash Data

Attachments

(2 files)

17 crashes from 10 installs of Firefox 138.0a1 20250325210233 on macOS and Windows.

Emilio, any idea what started this?

Crash report: https://crash-stats.mozilla.org/report/index/e98a3039-8b45-4bbb-bdb8-953cc0250326

Reason:

EXCEPTION_ACCESS_VIOLATION_READ

Top 10 frames:

0  xul.dll  mozilla::nsDisplayItem::GetPerFrameKey() const  layout/painting/nsDisplayList.h:2156
0  xul.dll  mozilla::MergeState::ProcessItemFromNewList(mozilla::nsDisplayItem*, mozilla:...  layout/painting/RetainedDisplayListBuilder.cpp:464
0  xul.dll  mozilla::RetainedDisplayListBuilder::MergeDisplayLists(mozilla::nsDisplayList...  layout/painting/RetainedDisplayListBuilder.cpp:835
1  xul.dll  mozilla::MergeState::MergeChildLists(mozilla::nsDisplayItem*, mozilla::nsDisp...  layout/painting/RetainedDisplayListBuilder.cpp:509
1  xul.dll  mozilla::MergeState::ProcessItemFromNewList(mozilla::nsDisplayItem*, mozilla:...  layout/painting/RetainedDisplayListBuilder.cpp:481
1  xul.dll  mozilla::RetainedDisplayListBuilder::MergeDisplayLists(mozilla::nsDisplayList...  layout/painting/RetainedDisplayListBuilder.cpp:835
2  xul.dll  mozilla::RetainedDisplayListBuilder::AttemptPartialUpdate(unsigned int)  layout/painting/RetainedDisplayListBuilder.cpp:1667
2  xul.dll  nsLayoutUtils::PaintFrame(gfxContext*, nsIFrame*, nsRegion const&, unsigned i...  layout/base/nsLayoutUtils.cpp:3169
3  xul.dll  mozilla::PresShell::PaintInternal(nsView*, mozilla::PaintInternalFlags)  layout/base/PresShell.cpp:6683
4  xul.dll  nsViewManager::ProcessPendingUpdatesPaint(nsIWidget*)  view/nsViewManager.cpp:399
Flags: needinfo?(emilio)

Obvious candidate would be bug 1955697. Bug 1955697 comment 16 is what landed the riskier changes, and that seems to have made it to 20250325210233.

There's an old crash with this signature in ESR128 but it should not be related.

Took a quick look and also there's no obvious pattern in the crashes (with regard to URIs and such). So my preference would be to try to get a repro from fuzzers over the next few days and if we can't get it on time back out from beta. I expect it to be a trivial-ish change...

An alternative, also related bug could be bug 1953823 (though that would need a non-default pref to be on...)

Flags: needinfo?(emilio)
Keywords: testcase-wanted
See Also: → 1924907

The bug is marked as tracked for firefox138 (nightly). We have limited time to fix this, the soft freeze is in a day. However, the bug still isn't assigned.

:fgriffith, could you please find an assignee for this tracked bug? If you disagree with the tracking decision, please talk with the release managers.

For more information, please visit BugBot documentation.

Flags: needinfo?(fgriffith)

See comment 1.

Assignee: nobody → emilio
Flags: needinfo?(fgriffith)

Thanks Emilio

Blocks: 1956391

This has been reported by live site testing and fuzzing. I will attach a reduced test case shortly.

Attached file testcase.html
Flags: in-testsuite?

Amazing, thanks! A pernosco recording would be awesome but otherwise I can look at this first thing tomorrow.

Flags: needinfo?(emilio)
Keywords: pernosco-wanted

Verified bug as reproducible on mozilla-central 20250326192415-33af93ce40ec.
The bug appears to have been introduced in the following build range:

Start: c6ac35e2fa35adaac9047ed6123d616f6d3fdf5c (20250325081146)
End: d926f7fefc9dc9eab1d2e65a87ab2883c5ef5143 (20250325091625)
Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=c6ac35e2fa35adaac9047ed6123d616f6d3fdf5c&tochange=d926f7fefc9dc9eab1d2e65a87ab2883c5ef5143

Whiteboard: [bugmon:bisected,confirmed]

Painting --- after optimization:
SolidColor p=0x10b2a25c0 f=0x10b22c020(Viewport(-1))
CompositorHitTestInfo p=0x10b2a2020 f=0x10b22c1b8(ScrollContainer(html)(-1))
AsyncZoom p=0x10b2a23f0 f=0x10b22c1b8(ScrollContainer(html)(-1))
CompositorHitTestInfo p=0x10b2a20d0 f=0x10b22c0d8(Canvas(html)(-1))
SolidColor p=0x10b2a21c8 f=0x10b22c0d8(Canvas(html)(-1))
CanvasBackgroundImage p=0x10b2a22e0 f=0x10b22c0d8(Canvas(html)(-1))

Painting --- Modified list (dirty 0,0,0,0):
SolidColor p=0x10b2a2b08 f=0x10b22c020(Viewport(-1))
CompositorHitTestInfo p=0x10b2a2690 f=0x10b22c1b8(ScrollContainer(html)(-1))
AsyncZoom p=0x10b2a2990 f=0x10b22c1b8(ScrollContainer(html)(-1))
SolidColor p=0x10b2a28d0 f=0x10b22c0d8(Canvas(html)(-1))
CompositorHitTestInfo p=0x10b2a2730 f=0x10b22c0d8(Canvas(html)(-1))
CanvasBackgroundImage p=0x10b2a27d0 f=0x10b22c0d8(Canvas(html)(-1))

The SolidColor and CompositorHitTestInfo items change places. If the order of items changes then there needs to be an invalidate marking these display items as needing to be rebuilt. I would guess no such invalidate is happening. The new code probably needs to be more careful about keeping the same order.

From the pernosco, the first time drawing we add solid color item here

https://searchfox.org/mozilla-central/rev/a920f0f657f8f2d81ef53c581433942059863dd5/layout/generic/nsCanvasFrame.cpp#438

The second time through we add the solid color item here

https://searchfox.org/mozilla-central/rev/a920f0f657f8f2d81ef53c581433942059863dd5/layout/generic/nsCanvasFrame.cpp#546

I'm guessing that the value of mCSSSpecified here

https://searchfox.org/mozilla-central/rev/a920f0f657f8f2d81ef53c581433942059863dd5/layout/generic/nsCanvasFrame.cpp#436

is changing without an invalidate. Probably after the image is loaded and decoded we set it to false here

https://searchfox.org/mozilla-central/rev/a920f0f657f8f2d81ef53c581433942059863dd5/layout/painting/nsCSSRendering.cpp#2454

and it's set to true before the image is decoded. mImage.IsOpaque() starts returning true because it calls imageContainer->WillDrawOpaqueNow() which checks if there is a decoded frame, and that decode was the one that painting requested, but the notification for that decode is still pending so we haven't invalidated.

I haven't looked at the new code in detail, but can we just put the solid color item in the same place always?

Based on comment #8, this bug contains a bisection range found by bugmon. However, the Regressed by field is still not filled.

:emilio, if possible, could you fill the Regressed by field and investigate this regression?

For more information, please visit BugBot documentation.

Flags: needinfo?(emilio)

The bug is linked to a topcrash signature, which matches the following criterion:

  • Top 10 desktop browser crashes on nightly

For more information, please visit BugBot documentation.

Keywords: topcrash
Flags: needinfo?(emilio)
Regressed by: 1955697

I haven't looked at the new code in detail, but can we just put the solid color item in the same place always?

No, because the color might need to be omitted for blending correctness. However in this case there's no blending so we should put it in the same place here...

Oh, I see what's going on, I think. That code inserts the canvas background at the bottom of the BorderBackground() list, but that might not be right if the list wasn't empty at the beginning... Or something along those lines.

Flags: needinfo?(emilio)

Instead of conditionally inserting the item at the bottom of the
BorderBackground() list, just suppress its color.

Flags: needinfo?(emilio)

hoping we can land this as soon as possible since it is a top crasher and we are in soft code freeze.... so adding a NI for patch review

Flags: needinfo?(mstange.moz)

I didn't realize this patch was a crash fix, the commit message made it sound like a refactoring. Thanks for the needinfo, looking now.

Flags: needinfo?(mstange.moz)
Pushed by ealvarez@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/d1cbe9636c3a Simplify how we suppress the canvas background color. r=mstange
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/51669 for changes under testing/web-platform/tests
Whiteboard: [bugmon:bisected,confirmed] → [bugmon:bisected,confirmed], [wptsync upstream]
Status: NEW → ASSIGNED
Whiteboard: [bugmon:bisected,confirmed], [wptsync upstream] → [viewtransitions:m1], [bugmon:bisected,confirmed], [wptsync upstream]
Status: ASSIGNED → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → 138 Branch

Verified bug as fixed on rev mozilla-central 20250328212938-f31082d6f90f.
Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.

Status: RESOLVED → VERIFIED
Keywords: bugmon
Upstream PR merged by moz-wptsync-bot
No longer depends on: grizzly, site-scout
Regressions: 1964293
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: